library(dplyr)
library(readxl)
library(tidygeocoder)
library(leaflet)
library(tidyr)
library(knitr)
library(ggplot2)
library(htmltools)
library(tidytext)
library(readr)
library(stringr)
library(forcats)
library(plotly)
library(writexl)
PM 566 Midterm - California Supplier Diversity and Net Income
Background and Research Question
To accelerate efforts to reduce health care disparities, hospitals and health systems are increasing their efforts in core areas such as staff and leadership diversity and cultural competence. However, the economic relationship between healthcare systems and the communities they serve is also growing in importance.
Supplier diversity refers to an organization procuring goods and services from a variety of businesses, including those that are at least 51% owned, managed, and operated by members of marginalized and minority groups. These groups include women, veterans, African Americans, LGBTQIA+ individuals, and more. After first being adopted by the federal government and its contractors, supplier diversity programs have been incorporated into the business practices of the private sector, including the healthcare industry. According to the Harvard Business Review, supplier diversity programs are important in combating social injustice and systemic racism in the US because they actively promote diverse representation and inclusion in hospital operations and supply chains. In addition to the moral and ethical arguments, supplier diversity programs have commercial benefits for hospitals and health systems. These benefits include greater innovation and value through cost reductions, expanded external partnerships, local job creation, a better understanding of supply chain sourcing processes and sources, and easier compliance with government and grant contracts.
Each fiscal year, individual hospitals and health systems report detailed facility-level financial data to the California Department of Health Care Access and Information (HCAI). These data cover service capacity, inpatient and outpatient utilization, revenues, and expenses by type and payer. In addition, Health and Safety Code Sections 1339.85-1339.87 require individual hospitals with operating expenses over $50 million to submit supplier diversity reports that include the hospital’s supplier diversity statement and describe its procurement efforts regarding minority, women, LGBT, and disabled veteran business enterprises.
This report merges the annual financial data and supplier diversity reports for 2023 to answer the question: are California hospitals with diverse suppliers profitable? Supplier diversity aims to increase innovation and drive down prices for supplies and goods through competition, while also improving health equity and combating social injustice in the US through business practices. This exploratory data analysis examines whether spending on goods and services from diversely owned businesses is associated with better financial outcomes.
Methods
A novel dataset was created by merging two datasets: HCAI’s Hospital Annual Financial Disclosure Report for 2023 and HCAI’s Supplier Diversity Report for 2023. As required by state law, supplier diversity and financial data are reported to HCAI each year. The datasets were merged on the shared hospital name variable to create a comprehensive dataset containing both supplier diversity and financial metrics for each hospital.
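Merging on hospital name alone is sensitive to differences in case and spacing between the two files. The following is a minimal sketch, using made-up hospital names and values rather than the HCAI data, of how the join key could be normalized and unmatched hospitals listed after a left join. The `clean_name` helper and the toy data frames are illustrative assumptions and are not part of the original analysis.

```r
# Toy stand-ins for the two HCAI extracts; all names and values are made up.
suppliers <- data.frame(
  Hospital_Name = c("ALPHA MEDICAL CENTER", "Beta Hospital ", "GAMMA HEALTH"),
  Combined_Total = c(100, 200, 300),
  stringsAsFactors = FALSE
)
finances <- data.frame(
  Hospital_Name = c("ALPHA MEDICAL CENTER", "BETA HOSPITAL", "DELTA CLINIC"),
  NET_INCOME = c(10, 20, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim whitespace and upper-case the join key.
clean_name <- function(x) toupper(trimws(x))
suppliers$Hospital_Name <- clean_name(suppliers$Hospital_Name)
finances$Hospital_Name <- clean_name(finances$Hospital_Name)

# all.x = TRUE keeps every supplier-report row, as in the report's merge() call.
merged <- merge(suppliers, finances, by = "Hospital_Name", all.x = TRUE)

# Supplier-report hospitals that found no financial record.
unmatched <- merged$Hospital_Name[is.na(merged$NET_INCOME)]
print(unmatched)
```

An NA in `NET_INCOME` can also reflect a value that is missing in the financial file itself, so this list is a starting point for review rather than a definitive count of failed matches.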
Following the merge, the variables recording procurement from diverse suppliers were converted to numeric variables to allow quantitative analysis. Street address, city, and ZIP code were combined into a single address string and geocoded to latitude and longitude with the tidygeocoder package, and the resulting coordinates were mapped with the leaflet package.
Summary and frequency tables were generated to identify the top hospitals by net income and by procurement from each supplier category, and to describe hospital demographics. Maps were generated to locate the top-performing hospitals, and correlations were run to quantify the relationship between supplier procurement and hospital net income.
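As a hedged illustration of the numeric recoding step, the sketch below shows how `as.numeric()` behaves on character values. The example strings are made up; whether the HCAI columns actually contain currency symbols, thousands separators, or text placeholders is an assumption that would need to be checked against the raw file.

```r
# Made-up examples of how a dollar amount might be stored as text.
raw <- c("1,250,000", "$300", "N/A", "42")

# Direct conversion turns anything that is not a plain number into NA.
naive <- suppressWarnings(as.numeric(raw))

# Stripping "$" and "," first preserves the formatted amounts.
cleaned <- suppressWarnings(as.numeric(gsub("[$,]", "", raw)))

# Counting NAs before and after shows how many values the cleanup recovers.
c(naive_na = sum(is.na(naive)), cleaned_na = sum(is.na(cleaned)))
```

Comparing the NA counts before and after conversion is a quick way to confirm that no reported dollar amounts were silently dropped.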
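The report does not show which correlation was run. As one possible approach, the sketch below computes a Spearman rank correlation between diverse-supplier spending and net income on toy values. Spearman is an assumption chosen here because the net income figures in this report are highly skewed; the column names mirror the merged dataset, but the numbers are invented.

```r
# Toy data: six hypothetical hospitals, one with missing procurement data.
toy <- data.frame(
  Combined_Total = c(1, 2, 3, 4, 5, NA),
  NET_INCOME = c(2, 1, 4, 3, 5, 7)
)

# cor.test() drops incomplete pairs before computing the statistic.
res <- cor.test(toy$Combined_Total, toy$NET_INCOME, method = "spearman")

res$estimate  # Spearman's rho
res$p.value   # p-value for the null hypothesis of no monotonic association
```

On the real data, the same call would take `final_hospitals$Combined_Total` and `final_hospitals$NET_INCOME`; a Pearson correlation (`method = "pearson"`) is the alternative if a linear relationship is of interest.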
hospital_suppliers <- read_excel("supplier-diversity-report-2023-extract-.xlsx")
hospital_finances <- read_excel("hadr-2023.xlsx")
hospitals <- merge(hospital_suppliers, hospital_finances, by = "Hospital_Name", all.x = TRUE)
final_hospitals <- hospitals %>%
select(Hospital_Name, Hospital_Address, Type_Control, County, MSSA, Supplier_Diversity_Statement, Encourage_Suppliers, Encourage_Employees, Conduct_Outreach_Comm, Certification_Support, Tier_I_African_American, Tier_II_African_American, Total_African_American, Tier_I_Hispanic_American, Tier_II_Hispanic_American, Total_Hispanic_American, Tier_I_Native_American, Tier_II_Native_American, Total_Native_American, Tier_I_Asian_Pacific_American, Tier_II_Asian_Pacific_American, Total_Asian_Pacific_American, Tier_I_Unknown_Minority, Tier_II_Unknown_Minority, Total_Unknown_Minority, Total_Tier_I_Minority, Total_Tier_II_Minority, Total_Minority, Tier_I_Women, Tier_II_Women, Total_Women, Tier_I_LGBT, Tier_II_LGBT, Total_LGBT, Tier_I_Disabled_Veteran, Tier_II_Disabled_Veteran, Total_Disabled_Veteran, Tier_I_Less_Duplicated_Amount, Tier_II_Less_Duplicated_Amount, Total_Less_Duplicated_Amount, Combined_Tier_I_Total, Combined_Tier_II_Total, Combined_Total, Total_Hospital_Procurement, ADDRESS, CITY, ZIP_CODE, GR_PT_REV, DED_FR_REV, TOT_CAP_REV, NET_PT_REV, OTH_OP_REV, TOT_OP_EXP, NET_FRM_OP, NONOP_REV, NONOP_EXP, INC_TAX, EXT_ITEM, NET_INCOME, EXP_SAL, EXP_BEN, EXP_PHYS, EXP_OTHPRO, EXP_SUPP, EXP_PURCH, EXP_DEPRE, EXP_LEASES, EXP_INSUR, EXP_INTRST, EXP_OTH
)
final_hospitals <- final_hospitals %>%
unite("full_address", ADDRESS, CITY, ZIP_CODE, sep = ", ", remove = FALSE) %>%
geocode(address = full_address, method = "osm", lat = latitude, long = longitude)
final_hospitals <- final_hospitals %>%
mutate(
Combined_Total = as.numeric(as.character(Combined_Total)),
Total_Hospital_Procurement = as.numeric(as.character(Total_Hospital_Procurement)),
Total_Minority = as.numeric(as.character(Total_Minority))
)
final_hospitals <- final_hospitals %>%
mutate(across(c(Tier_I_African_American, Tier_II_African_American, Total_African_American,
Tier_I_Hispanic_American, Tier_II_Hispanic_American, Total_Hispanic_American,
Tier_I_Native_American, Tier_II_Native_American, Total_Native_American,
Tier_I_Asian_Pacific_American, Tier_II_Asian_Pacific_American, Total_Asian_Pacific_American,
Tier_I_Unknown_Minority, Tier_II_Unknown_Minority, Total_Unknown_Minority,
Total_Tier_I_Minority, Total_Tier_II_Minority, Total_Minority,
Tier_I_Women, Tier_II_Women, Total_Women,
Tier_I_LGBT, Tier_II_LGBT, Total_LGBT,
Tier_I_Disabled_Veteran, Tier_II_Disabled_Veteran, Total_Disabled_Veteran,
Tier_I_Less_Duplicated_Amount, Tier_II_Less_Duplicated_Amount, Total_Less_Duplicated_Amount,
Combined_Tier_I_Total, Combined_Tier_II_Total, Combined_Total, Total_Hospital_Procurement),
as.numeric))
Demographics
This report covers 372 hospitals with both financial and supplier diversity data reported to HCAI in 2023. On average, these hospitals reported net income of over $24 million for the year and spent about $9.6 million on diverse suppliers, roughly 8 percent of the $115.8 million they spent on procurement on average.
A majority of these hospitals (320) serve urban areas in California, as shown by the large clusters of hospitals around major cities such as San Francisco, San Diego, and Los Angeles. Over 53% of the hospitals are nonprofit, including church-related facilities.
leaflet(data = final_hospitals) %>%
addTiles() %>%
addCircleMarkers(~longitude, ~latitude,
popup = ~Hospital_Name, # Display hospital name on click
radius = 5, color = "blue", fill = TRUE, fillOpacity = 0.7) %>%
setView(lng = -119.4179, lat = 36.7783, zoom = 6) # Center on California
summary_table <- final_hospitals %>%
summarize(
`Total Hospitals` = n(),
`Average Procurement from Diverse Suppliers` = mean(Combined_Total, na.rm = TRUE),
`Average Total Hospital Procurement` = mean(Total_Hospital_Procurement, na.rm = TRUE),
`Average Net Income` = mean(NET_INCOME, na.rm = TRUE)
)
kable(summary_table, caption = "Summary Table of Hospital Data")
| Total Hospitals | Average Procurement from Diverse Suppliers | Average Total Hospital Procurement | Average Net Income |
|---|---|---|---|
| 372 | 9577526 | 115754584 | 24413739 |
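As a quick check on the diverse-supplier share, the averages in the summary table can be divided directly. This sketch uses only the two figures printed above; the variable names are illustrative.

```r
# Averages copied from the summary table above.
avg_diverse <- 9577526    # Average Procurement from Diverse Suppliers
avg_total <- 115754584    # Average Total Hospital Procurement

# Share of average procurement that goes to diverse suppliers, in percent.
share_pct <- 100 * avg_diverse / avg_total
round(share_pct, 1)
```

This is a ratio of averages, which weights large hospitals more heavily. Averaging each hospital's own share (`Combined_Total / Total_Hospital_Procurement`) could give a different figure.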
urban_rural_freq <- final_hospitals %>%
group_by(MSSA) %>%
summarize(Frequency = n()) %>%
ungroup()
type_control_freq <- final_hospitals %>%
group_by(Type_Control) %>%
summarize(Frequency = n()) %>%
ungroup()
print(kable(urban_rural_freq, caption = "Frequencies of Urban/Rural Hospitals"))
Table: Frequencies of Urban/Rural Hospitals
|MSSA | Frequency|
|:-----|---------:|
|Rural | 52|
|Urban | 320|
print(kable(type_control_freq, caption = "Frequencies of Hospital Types (Type_Control)"))
Table: Frequencies of Hospital Types (Type_Control)
|Type_Control | Frequency|
|:---------------------------------------------|---------:|
|City or County | 23|
|District | 17|
|Investor - Corporation | 50|
|Investor - Limited Liability Company | 57|
|Investor - Partnership | 11|
|Non-profit Corporation (incl. Church-related) | 198|
|State | 6|
|University of California | 10|
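The frequencies above can be converted to percentage shares to check the demographic summary. The sketch below re-enters the counts from the two tables by hand; on the real data the same result would come from applying `prop.table()` to the counts.

```r
# Counts copied from the Type_Control frequency table above.
type_control <- c(
  "City or County" = 23,
  "District" = 17,
  "Investor - Corporation" = 50,
  "Investor - Limited Liability Company" = 57,
  "Investor - Partnership" = 11,
  "Non-profit Corporation (incl. Church-related)" = 198,
  "State" = 6,
  "University of California" = 10
)

# Percentage of hospitals in each ownership category.
type_shares <- round(100 * type_control / sum(type_control), 1)
type_shares

# Counts copied from the urban/rural frequency table above.
mssa <- c(Rural = 52, Urban = 320)
mssa_shares <- round(100 * mssa / sum(mssa), 1)
mssa_shares
```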
How much are hospitals spending on supplies from minority-owned businesses?
Washington Hospital in Fremont spends the most on minority-owned suppliers, followed by Stanford Health Care. By category, Kaiser Foundation Hospital - Santa Clara spends the most on African American-owned suppliers, while Stanford leads in the Hispanic American and Asian Pacific American categories. Washington Hospital in Fremont is also the top spender in the unknown minority category, while other hospitals, such as Kaiser Foundation Hospital - San Diego and Contra Costa Regional Medical Center, lead in the women-owned and LGBT-owned supplier categories, respectively.
# Get the top diverse hospitals and select only the relevant columns
top_diverse_hospitals <- final_hospitals %>%
arrange(desc(Total_Minority)) %>%
select(Hospital_Name, Total_Minority) %>%
head(n = 10)
kable(top_diverse_hospitals, col.names = c("Hospital Name", "Combined Total Spent on Minority Owned Suppliers"))
| Hospital Name | Combined Total Spent on Minority Owned Suppliers |
|---|---|
| WASHINGTON HOSPITAL - FREMONT | 265276375 |
| STANFORD HEALTH CARE | 113963711 |
| CHILDREN’S HOSPITAL OF ORANGE COUNTY | 66667480 |
| UCSF MEDICAL CENTER | 54596986 |
| KAISER FOUNDATION HOSPITAL - SAN DIEGO - CLAIREMONT MESA | 50859953 |
| KAISER FOUNDATION HOSPITAL - DOWNEY | 46252720 |
| KAISER FOUNDATION HOSPITAL - SANTA CLARA | 45552873 |
| KAISER FOUNDATION HOSPITAL - RIVERSIDE | 38359794 |
| KAISER FOUNDATION HOSPITAL - LOS ANGELES | 37618486 |
| CEDARS-SINAI MEDICAL CENTER | 35384829 |
selected_vars <- c(
"Total_African_American", "Total_Hispanic_American",
"Total_Native_American", "Total_Asian_Pacific_American",
"Total_Unknown_Minority", "Total_Minority",
"Total_Women", "Total_LGBT", "Total_Disabled_Veteran"
)
top_hospitals_df <- data.frame(Top_Hospital = character(), Top_Value = numeric(), Category = character(), stringsAsFactors = FALSE)
# Loop through each variable to find the top hospital
for (var_name in selected_vars) {
top_hospital <- final_hospitals %>%
filter(!is.na(!!sym(var_name))) %>% # Exclude NA values for the variable
top_n(1, !!sym(var_name)) %>% # Get the hospital with the highest value
select(Hospital_Name, !!sym(var_name)) %>% # Select relevant columns
rename(Top_Hospital = Hospital_Name, Top_Value = !!sym(var_name)) %>% # Rename for clarity
mutate(Category = var_name) # Add the category as a new column
top_hospitals_df <- rbind(top_hospitals_df, top_hospital)
}
# Display the table with kable, including the Category column
kable(top_hospitals_df,
col.names = c("Top Hospital", "Top Value", "Category"),
caption = "Top Performing Hospitals by Category")
| Top Hospital | Top Value | Category |
|---|---|---|
| KAISER FOUNDATION HOSPITAL - SANTA CLARA | 16711025 | Total_African_American |
| STANFORD HEALTH CARE | 18140655 | Total_Hispanic_American |
| KECK HOSPITAL OF USC | 1649217 | Total_Native_American |
| STANFORD HEALTH CARE | 86407777 | Total_Asian_Pacific_American |
| WASHINGTON HOSPITAL - FREMONT | 252433738 | Total_Unknown_Minority |
| WASHINGTON HOSPITAL - FREMONT | 265276375 | Total_Minority |
| KAISER FOUNDATION HOSPITAL - SAN DIEGO - CLAIREMONT MESA | 33411813 | Total_Women |
| CONTRA COSTA REGIONAL MEDICAL CENTER | 30308261 | Total_LGBT |
| KAISER FOUNDATION HOSPITAL - MODESTO | 17341068 | Total_Disabled_Veteran |
Where are the most diverse hospitals located?
Across categories and in total, a majority of the hospitals that spend the most on diverse suppliers are in the Bay Area. However, a significant number of hospitals in Southern California spend heavily on suppliers owned by women and disabled veterans. The top-spending hospitals in Northern California include a few state and University of California hospitals, such as UC San Francisco, as well as Stanford.
total_minority_palette <- colorNumeric(palette = "Reds",
domain = final_hospitals$Total_Minority)
# Create a leaflet plot for hospitals
leafplot <- leaflet(final_hospitals) %>%
addProviderTiles('CartoDB.Positron') %>%
addCircles(
lat = ~latitude,
lng = ~longitude,
label = ~lapply(paste0("Hospital: ", Hospital_Name, "<br>Total Minority: ", Total_Minority), htmltools::HTML), # wrap in HTML() so <br> renders as a line break
color = ~total_minority_palette(Total_Minority),
opacity = 1,
fillOpacity = 1,
stroke = FALSE,
radius = 5
) %>%
addLegend('bottomleft',
pal = total_minority_palette,
values = final_hospitals$Total_Minority,
title = 'Total Minority in Hospitals',
opacity = 1) %>%
setView(lng = -119.4179, lat = 36.7783, zoom = 6) # Set view to California
leafplot
# Explicitly list the variables to create maps for
selected_vars <- c(
"Total_African_American", "Total_Hispanic_American",
"Total_Native_American", "Total_Asian_Pacific_American",
"Total_Unknown_Minority", "Total_Minority",
"Total_Women", "Total_LGBT", "Total_Disabled_Veteran"
)
create_map <- function(var_name) {
color_pal <- colorNumeric("Purples", domain = final_hospitals[[var_name]], na.color = "transparent")
leaflet(data = final_hospitals) %>%
addProviderTiles("CartoDB.Positron") %>%
setView(lng = -119.4179, lat = 36.7783, zoom = 6) %>%
addCircles(
lat = ~latitude, lng = ~longitude,
color = ~color_pal(final_hospitals[[var_name]]),
opacity = 1, fillOpacity = 0.8
) %>%
addLegend("bottomleft", pal = color_pal, values = final_hospitals[[var_name]],
title = var_name, opacity = 1)
}
# Generate a list of maps for each specified variable
map_list <- lapply(selected_vars, create_map)
# Display all maps in one view
browsable(tagList(map_list))
What hospitals make the most money?
Regardless of supplier diversity, most of the top 10 earning hospitals are in the Bay Area or the Los Angeles/Orange County area. Cedars-Sinai, LA General, UCLA, and Children's Hospital of Orange County are among the top 10 earning hospitals in California; however, high net income does not mean they spend the most on supplies from minority-owned businesses.
Most hospitals in the plot below report net income close to the average, while a few, such as Eden Medical Center, report the lowest net income.
# Load the knitr package if it’s not already loaded
library(knitr)
# Prepare the top hospitals table
top_hospitals <- final_hospitals %>%
arrange(desc(NET_INCOME)) %>%
head(n = 10) %>%
select(Hospital_Name, NET_INCOME)
# Display the table with just kable
kable(top_hospitals,
col.names = c("Hospital Name", "Net Income"),
caption = "Top 10 Hospitals by Net Income",
format = "html") # Specify HTML format to ensure compatibility in an HTML document
| Hospital Name | Net Income |
|---|---|
| STANFORD HEALTH CARE | 808452386 |
| CEDARS-SINAI MEDICAL CENTER | 570706272 |
| RADY CHILDREN'S HOSPITAL - SAN DIEGO | 522677659 |
| LOS ANGELES GENERAL MEDICAL CENTER | 475854337 |
| EL CAMINO HEALTH | 315951240 |
| RONALD REAGAN UCLA MEDICAL CENTER | 303667992 |
| HOAG MEMORIAL HOSPITAL PRESBYTERIAN | 302767652 |
| SHARP MEMORIAL HOSPITAL | 298370158 |
| CHILDREN'S HOSPITAL OF ORANGE COUNTY | 234476380 |
| KAISER FOUNDATION HOSPITAL - SANTA CLARA | 231096595 |
library(plotly)
# Create a scatter plot for NET_INCOME across all hospitals
net_income_scatter_plot <- ggplot(final_hospitals, aes(x = seq_along(NET_INCOME), y = NET_INCOME, text = Hospital_Name)) +
geom_point(color = "blue", alpha = 0.6) +
theme_minimal() +
labs(
title = "Scatter Plot of Net Income Across All Hospitals",
x = "Hospital Index", # X-axis label indicating each hospital's position
y = "Net Income"
)
# Convert the ggplot to an interactive plotly plot
interactive_plot <- ggplotly(net_income_scatter_plot, tooltip = "text")
interactive_plot
What are hospitals saying in their commitment to supplier diversity?
Each hospital's report also includes a supplier diversity statement describing its commitment to procuring from diverse suppliers. Although providing a statement is not required, a majority of hospitals in this dataset reported one. To identify common themes, I extracted the 20 most frequent words and 3-word phrases from the supplier diversity statements.
This analysis suggests that hospitals express a strong commitment to procuring from diverse suppliers in order to support their business needs. They also aim to drive competition, possibly among suppliers, to lower prices.
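To make the counting step concrete, here is a minimal base R sketch of word and 3-word phrase frequencies on two made-up statements. It is a simplified stand-in for the tidytext pipeline used in this report, not a reproduction of it: the tiny stop list, the `trigrams` helper, and the example statements are all illustrative assumptions.

```r
# Made-up statements; NA represents a hospital that reported no statement.
statements <- c(
  "We are committed to supplier diversity.",
  "Our hospital supports diverse suppliers and supplier diversity.",
  NA
)
reported <- statements[!is.na(statements)]

# Word counts: lower-case, split on non-letters, drop a tiny stop list.
stop_list <- c("we", "are", "to", "our", "and")
words <- unlist(strsplit(tolower(reported), "[^a-z]+"))
words <- words[nzchar(words) & !(words %in% stop_list)]
word_counts <- sort(table(words), decreasing = TRUE)

# Hypothetical helper: all runs of three consecutive words in one statement.
trigrams <- function(text) {
  w <- strsplit(tolower(text), "[^a-z]+")[[1]]
  w <- w[nzchar(w)]
  if (length(w) < 3) return(character(0))
  vapply(seq_len(length(w) - 2),
         function(i) paste(w[i:(i + 2)], collapse = " "),
         character(1))
}
all_trigrams <- unlist(lapply(reported, trigrams))
trigram_counts <- sort(table(all_trigrams), decreasing = TRUE)

head(word_counts, 5)
head(trigram_counts, 5)
```

As in the report's own trigram step, stop words are kept in the phrases, so common connective phrases can rank highly alongside substantive ones.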
final_hospitals %>%
unnest_tokens(word, Supplier_Diversity_Statement) %>%
filter(!str_detect(word, "\\d")) %>%
anti_join(stop_words, by = c("word"))%>%
count(word, sort=TRUE) %>%
top_n(20, n) %>%
ggplot(aes(x = n, y = fct_reorder(word, n))) +
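geom_col()
final_hospitals %>%
unnest_ngrams(token, Supplier_Diversity_Statement, n = 3) %>%
filter(!is.na(token)) %>% # drop rows from statements that are missing or shorter than three words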
count(token, sort=TRUE) %>%
top_n(20, n) %>%
ggplot(aes(x = n, y = fct_reorder(token, n))) +
geom_col()
Conclusion
Overall, preliminary data suggest a potential positive relationship between procurement from diverse suppliers and hospital net income. Supplier diversity remains a critical priority for hospitals and medical centers throughout California. Notably, some of the highest-earning hospitals do not allocate significant resources to diverse suppliers. Further analysis with greater statistical power is needed to clarify this relationship. Hospitals can help address health equity gaps and combat systemic racism and injustice by actively investing in and procuring from minority-owned businesses.
Another potential area of analysis would be to remove or isolate spending on suppliers that are the industry standard. For example, Epic Systems is one of the industry leaders in electronic medical record software and holds the medical records of 78% of patients in the United States. Its CEO and founder is businesswoman Judith Faulkner, who was called “the most powerful woman in healthcare” by Forbes in 2013. In this dataset, Epic would still be categorized as a diverse supplier because of Judith Faulkner’s ownership status, even though it is an industry leader. This presents an interesting dynamic: Epic’s influence and widespread use could skew the analysis of diversity spend toward funds allocated to industry giants. Examining the data specifically for diverse suppliers that are not major players in their industries could provide valuable insight into how funds are distributed among smaller, potentially emerging minority-owned businesses and the unique challenges they face in gaining industry traction and growth.